Genealogical Record Linkage: Features for Automated Person Matching

Author

  • D. Randall Wilson
Abstract

This paper provides a high-level overview of how automatic person matching (genealogical record linkage) algorithms can be developed, and then provides a detailed explanation of many of the features used by FamilySearch in doing person matching. Empirical results show a dramatic improvement in accuracy by using these features trained with neural networks, when compared to traditional probabilistic record linkage with simple field agreement features.

Similar resources

Record Linkage for Genealogical Databases

In this paper we describe past experience and outline current directions in performing record linkage over large genealogical databases. 1. INTRODUCTION AND MOTIVATION Record linkage is the problem of identifying multiple records that refer to the same real-world entity. In genealogical databases, it is the problem of identifying when individuals situated in different pedigrees refer to the sam...

Utilizing Stacking for Feature Reduction in Graph-Based Genealogical Record Linkage

Genealogy research is centered on collecting records about an individual from various sources and combining the information to gain a larger historical perspective about that individual, commonly in the form of a pedigree. Data extraction, the internet, and other technological advancements have made large amounts of digital genealogical data more accessible. Discovering the relevancy of a digit...

Probabilistic Record Linkage for Genealogical Research

The most slow and tedious job in genealogical research is searching civil or church records for information about an individual. But, this is an essential step in research. By searching multiple sources such as census records, wills, deeds, birth and death records we can compile a more complete set of information, and potentially the pedigree of an individual. When records are stored electronic...

Building a Life Course Dataset from Australian Convict Records: Founders & Survivors: Australian Life Courses in Historical Context, 1803-1920

Founders & Survivors is a multi-university and public collaborative project that is building a transnational and inter-generational dataset of life courses generated from the UNESCO recognized convict records of Tasmania. This paper outlines the technical history of the project: mass digitization and archiving online of over 100,000 images; manual scholarly transcription; TEI standard XML data ...

A Comparison and Analysis of Name Matching Algorithms

Names are important in many societies, even in technologically oriented ones which use e.g. ID systems to identify individual people. Names such as surnames are the most important as they are used in many processes, such as identifying of people and genealogical research. On the other hand variation of names can be a major problem for the identification and search for people, e.g. web search or...

Publication date: 2010